1.1.2 Gradient Approximation
As described in Section 1.1.1, when updating the parameters of BNNs and 1-bit networks, the full-precision weights are updated with the gradient $\frac{\partial C}{\partial \omega^b}$. However, forward propagation applies a sign function between the full-precision weights and the binarized weights, so the gradient of the sign function must also be considered when updating the full-precision weights. Note that the derivative of the sign function is zero almost everywhere and infinite only at zero, so a differentiable function is widely used to approximate the sign function.
BinaryNet [99] was the first to address this problem in a 1-bit network. Assume that an estimator $g_q$ of the gradient $\frac{\partial C}{\partial q}$, where $q = \mathrm{Sign}(r)$, has been obtained. Then the straight-through estimator of $\frac{\partial C}{\partial r}$ is simply
$$
g_r = g_q \mathbf{1}_{|r| \le 1},
\tag{1.3}
$$
where $\mathbf{1}_{|r| \le 1}$ equals 1 when $|r| \le 1$ and 0 otherwise. This can also be seen as propagating the gradient through hard tanh, which is a piecewise-linear activation function.
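As an illustration of Eq. (1.3), the following is a minimal PyTorch-style sketch of sign binarization with the straight-through estimator; the class name `BinarizeSTE` and the use of `torch.autograd.Function` are illustrative assumptions rather than BinaryNet's released code.

```python
import torch


class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with the straight-through estimator of Eq. (1.3)."""

    @staticmethod
    def forward(ctx, r):
        ctx.save_for_backward(r)   # keep r to build the mask 1_{|r| <= 1}
        return torch.sign(r)       # q = Sign(r)

    @staticmethod
    def backward(ctx, g_q):
        (r,) = ctx.saved_tensors
        # g_r = g_q * 1_{|r| <= 1}: pass the gradient where |r| <= 1 and block
        # it elsewhere, i.e., differentiate a hard tanh instead of the sign.
        return g_q * (r.abs() <= 1).to(g_q.dtype)


# Usage sketch:
# x = torch.randn(4, requires_grad=True)
# y = BinarizeSTE.apply(x)
# y.sum().backward()   # x.grad is nonzero only where |x| <= 1
```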
Bi-Real Net [159] approximates the derivative of the sign function for activations. Instead of using Htanh [99] to approximate the sign function, Bi-Real Net uses a piecewise polynomial function for a tighter approximation, sketched below.
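The following is a minimal sketch of such a piecewise polynomial backward pass, assuming the commonly cited form whose derivative is $2 + 2x$ on $[-1, 0)$, $2 - 2x$ on $[0, 1)$, and 0 elsewhere; the class name `ApproxSign` is an illustrative assumption.

```python
import torch


class ApproxSign(torch.autograd.Function):
    """Sign in the forward pass; piecewise polynomial gradient in the backward
    pass, in the spirit of Bi-Real Net's activation approximation."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Derivative of the piecewise polynomial approximation of sign(x):
        #   2 + 2x on [-1, 0),  2 - 2x on [0, 1),  0 elsewhere.
        grad = torch.zeros_like(x)
        grad = torch.where((x >= -1) & (x < 0), 2 + 2 * x, grad)
        grad = torch.where((x >= 0) & (x < 1), 2 - 2 * x, grad)
        return grad_out * grad
```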
Bi-Real Net also proposes a magnitude-aware gradient for weights. When training BNNs, the gradient $\frac{\partial C}{\partial W}$ is related only to the sign of the weights and is independent of their magnitude. Bi-Real Net therefore replaces the sign function with a magnitude-aware function.
Xu et al. [266] use a higher-order approximation for weight binarization. They propose a long-tailed approximation for activation binarization as a trade-off between tight approximation and smooth backpropagation.
Differentiable Soft Quantization (DSQ) [74] also introduces a function, called differentiable soft quantization, to approximate the standard binary and uniform quantization process. DSQ employs hyperbolic tangent functions to gradually approach the staircase function used in low-bit quantization (the sign function in a 1-bit CNN). The binary DSQ function is as follows:
$$
Q_s(x) =
\begin{cases}
-1, & x < -1 \\
1, & x > 1 \\
s \tanh(kx), & \text{otherwise},
\end{cases}
\tag{1.4}
$$
with
$$
k = \frac{1}{2}\log\!\left(\frac{2}{\alpha} - 1\right), \qquad s = \frac{1}{1 - \alpha}.
\tag{1.5}
$$
In particular, when $\alpha$ is small, DSQ closely approximates the behavior of uniform quantization. This means that a suitable $\alpha$ allows DSQ to help train a quantized model with higher accuracy. Note that DSQ is differentiable, so its derivative can be used directly when updating the parameters.
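For concreteness, below is a small sketch of the binary DSQ function of Eqs. (1.4) and (1.5); the function name `binary_dsq` and the example value of `alpha` are illustrative assumptions.

```python
import math

import torch


def binary_dsq(x, alpha=0.2):
    """Binary DSQ of Eqs. (1.4)-(1.5): a tanh-based soft surrogate of sign(x).

    A smaller alpha makes s * tanh(k * x) steeper, i.e., closer to the hard
    sign function, while keeping the function differentiable.
    """
    k = 0.5 * math.log(2.0 / alpha - 1.0)   # Eq. (1.5)
    s = 1.0 / (1.0 - alpha)                 # Eq. (1.5)
    soft = s * torch.tanh(k * x)
    # Saturate to -1 and +1 outside [-1, 1], as in Eq. (1.4).
    return torch.where(x < -1, torch.full_like(x, -1.0),
                       torch.where(x > 1, torch.full_like(x, 1.0), soft))


# Usage sketch:
# x = torch.linspace(-2, 2, 9, requires_grad=True)
# y = binary_dsq(x)
# y.sum().backward()   # smooth, nonzero gradients inside [-1, 1]
```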
In summary, all of the above methods introduce a function that approximates the sign function in BinaryConnect so that the gradient with respect to the full-precision weights or activations can be estimated more accurately. As a result, the BNN or 1-bit network converges more easily during training, and the network performance improves.
1.1.3 Quantization
BinaryConnect and BinaryNet use simple quantization methods. After the full-precision weights are updated, the new binary weights are generated by taking the sign of the real-valued weights. But when the binary weights are determined only by the sign of the full-precision weights,